A METHOD FOR PROCESSING A DIGITAL VIDEO AUDIO SIGNAL

BACKGROUND OF THE INVENTION 

Field of the Invention 

The present invention relates to a method of decoding an audio signal 
included in a Digital Video (DV) data stream. The DV format is commonly used to store 
video and audio sourced from domestic camcorders. The data format is adapted to store 
both data related to the video signal and the audio signal. Both signals are generally 
decoded separately. 

Description of the Prior Art 

The audio part of DV data is formatted to include: 

■ Audio Pre-amble 

■ 14 Data-Sync blocks, each including: 

■ a Sync area of 2 bytes 

■ an ID code of 3 bytes 

■ a data area of 85 bytes 

■ Audio Post-amble 

The format of the data-sync blocks is shown in Figure 1.

When the audio data is encoded and stored in DV format, audio samples and
data are shuffled over different tracks and data-sync blocks within an audio frame. Firstly,
audio data is shuffled, and then dummy data is added. The position of the n-th audio sample
is determined from the equations (1) - (6) which follow:

$$t_1 = [\lfloor n/3 \rfloor + 2 * (n \% 3)] \% T \qquad (1)$$

$$t_2 = [\lfloor n/3 \rfloor + 2 * (n \% 3)] \% T + T \qquad (2)$$
In relation to the above equations, R is the set of all real numbers, Z is the
set of all integers (positive and negative), N is the set of all natural numbers, t₁ and t₂ are
the track numbers for channel 1 and channel 2 respectively, s₁ is the sync block number,
and b₁ is the byte position within the DIF block. A DIF (Digital Interface Frame) block is a
sub-part of a DV frame. A DV frame includes either 90 or 108 DIF blocks depending on
whether the system is 525/60 or 625/50 respectively. Figure 5 illustrates the relationship
between a DV frame and a DIF block, where the dotted box represents a DV frame, while
each individual row of data samples represents a DIF block.

With regard to the notation used in this specification, [a,b] indicates a range
inclusive of both a and b; (a,b] indicates a range exclusive of a, but inclusive of b; and
[a,b) indicates a range inclusive of a, but exclusive of b.

If x is a real number (x ∈ R), then ⌊x⌋ indicates the largest integer that is ≤ x,
and x%y indicates the remainder of the division of x by y, where x, y ∈ N. By number
theory, x%y ∈ [0, y).

It can be seen from equations (4) - (6) that bytes belonging to the same
sample are distributed consecutively within the same DIF block. Samples from different
channels but with the same indices have the same sync block number and byte position but
different track numbers. The relationship between sync-blocks and tracks is illustrated in
Table 1, which shows how they are related to the sample index and the constants T, K₁, K₂,
and B.

The coding process involves a shuffling operation which maps from the
PCM (Pulse Code Modulation) domain to the DV domain. Figure 2 shows a top level
block diagram of the mapping operation, showing how the raw PCM data is mapped on the
basis of t₁, t₂, s₁ and b₁ into DV format. Figure 3 shows a more detailed view of the
shuffling of particular samples, in which the left hand side shows a PCM frame with data
samples D₀ ... Dₙ. The dotted rectangle on the right hand side shows a DV frame. For each
PCM sample, Dₙ, its index, n, is used as the input to the shuffling equations (1) to (6) to
determine its corresponding position in the DV frame. That is, (t₁, s₁, b₁) = f(n) ⇒
DV[t₁][s₁][b₁] = PCM[n] = Dₙ.

The values T, B, K₁ and K₂ are system dependent and are summarized in the
table below, along with the constants C₁ and C₂.

Table 1



The numbers in the System column of the above table refer to the number of
lines in the video system and the refresh rate. So, 525/60 is a 525 line TV system with a
60 Hz refresh rate.

The DV audio signal is decoded to enable the audio to be reproduced by
playback equipment, such as a video cassette player. If the shuffled coded data may be
represented as (t₁, t₂, s₁, b₁) = f(n), then the reverse mapping f⁻¹ may be considered to
provide the correct order of data. This concept is shown in Figure 4.

However, this concept is not generally possible in practice, as the shuffling
process involves modular and non-linear operations, such as ⌊x⌋, which result in a
one-to-many reverse relationship. It is therefore not generally possible to easily find a
suitable reverse mapping f⁻¹.
Prior art DV audio decoders therefore generally operate in one of two
known ways. The first way involves the creation of a Look Up Table (LUT) in which a
mapping relationship between received data and the required output data is pre-computed
and stored to enable the received data to be re-formatted accordingly.
This method includes the following steps:

■ Every element in the table is initialized to -1;

■ For every n ∈ [0, N], compute (t₁, s₁, b₁) using the shuffling
equations (1) - (6);

■ Store values for n in a LUT with index [t₁, s₁, b₁], i.e., LUT[t₁][s₁][b₁] = n;

■ For any incoming shuffled data byte, determine its position in the
raw PCM data:

■ IF LUT[t₁][s₁][b₁] = -1 THEN discard value,

■ ELSE PCM[LUT[t₁][s₁][b₁]] = DV[t₁][s₁][b₁]
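
The look-up-table steps above can be sketched as follows. This is only an illustration under stated assumptions, not a definitive implementation: the constant offsets C₁ and C₂ are omitted, the values T = 5, K₁ = 45, K₂ = 15 and B = 2 are assumed for a 2-channel 525/60 system, each track is assumed to carry nine sync blocks of 72 data bytes, and the function names and the sample count of 1600 are hypothetical.

```python
def shuffle(n, T=5, K1=45, K2=15, B=2):
    """Forward mapping f(n) -> (t1, s1, b1) for channel 1, first byte only.

    Assumed constants; the offsets C1 and C2 are left out for brevity.
    """
    t1 = (n // 3 + 2 * (n % 3)) % T
    s1 = 3 * (n % 3) + (n % K1) // K2
    b1 = B * (n // K1)
    return t1, s1, b1


def build_lut(num_samples, tracks=5, sync_blocks=9, data_bytes=72):
    """Pre-compute LUT[t1][s1][b1] = n, with -1 marking unused positions."""
    lut = [[[-1] * data_bytes for _ in range(sync_blocks)]
           for _ in range(tracks)]
    for n in range(num_samples):
        t1, s1, b1 = shuffle(n)
        lut[t1][s1][b1] = n
    return lut


def lookup(lut, t1, s1, b1):
    """Return the PCM index for a DV position, or None if it is discarded."""
    n = lut[t1][s1][b1]
    return None if n == -1 else n


lut = build_lut(1600)
entries = sum(len(row) for track in lut for row in track)
filled = sum(1 for track in lut for row in track for v in row if v != -1)
```

Under these assumed dimensions the table has one entry for every data byte position on the five tracks, which illustrates why the LUT is comparable in size to the frame it indexes.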

The major disadvantage of this particular method is the large amount of
memory required to store the LUT. Since the constants (T, K₁, K₂ and B) involved in the
shuffling equations can be of different values depending on whether it is a 525/60 or
625/50 system, or a 2 or 4-channel system, four separate look-up tables are required. Each
LUT is similar in size to a DV frame.

The second method involves receiving and buffering an entire DV audio
frame, which is then analyzed so that the received data can be decoded on the basis of the
analysis.

Using this method, there is no requirement to explicitly determine the
reverse mapping f⁻¹, but it has the drawback that an entire DV audio frame has to be
buffered or stored before decoding can start. This is because a sample occurring in the
very first position of the raw PCM data may be shuffled to a position very late in the DV
frame. Therefore, this technique may only be used once a complete DV audio frame is
available.
To use this method, values of (t₁, s₁, b₁) are calculated for n = 0 to N, i.e.,
(t₁, s₁, b₁) = f(n). Then PCM[n] = DV[t₁][s₁][b₁].

The prior art methods are problematic in that they both require relatively
large amounts of memory in order to either store the LUT or buffer the received signal for
further analysis. The methods themselves are relatively straightforward to implement, but
the memory requirements render them undesirable in practical systems.

BRIEF SUMMARY OF THE PRESENT INVENTION 

In a first broad form, one embodiment of the present invention provides a
method of decoding audio data, encoded in multiple DIF blocks in a Digital Video (DV)
data stream, and outputting said audio data as a PCM frame, including the following steps:

(i) fetching a single Digital Interface Frame (DIF) block from the DV
data stream;

(ii) de-shuffling a first byte in the single DIF block to determine its
index (n) in the PCM frame;

(iii) repeating step (ii) until the last byte in the single DIF block is
processed;

(iv) writing the de-shuffled data into the PCM frame for output if the
present DIF block is the last in the present DV frame;

(v) repeating steps (i) to (iv).

By needing only a single DIF block in order to de-shuffle the received data,
embodiments of the present invention offer advantages over prior art solutions which
require receipt of an entire DV frame consisting of many tens of DIF blocks, or storage of
large LUTs, before de-shuffling can begin.

Preferably, the index (n) of a particular data sample in the output PCM
frame is dependent on parameters of the DV data.

Preferably, the parameters include: 

■ track number (t) 

■ sync block number (s) 
■ byte position within the DIF block (b)

Preferably, for the first DIF block of a new frame, t, s and a DIF block
counter are set to zero.

Preferably, s is incremented by 1 each time a new DIF block is received, and
is reset to zero every nine DIF blocks.

Preferably, t is incremented by 1 every nine DIF blocks. 

Preferably, the DV data may be encoded to one of a plurality of different
video systems, such as 525/60 (2-channel or 4-channel) or 625/50 (2-channel or
4-channel).

Preferably, each different video system may be characterized by several
different constants used in the encoding and decoding of data, these constants being:
Preferably, the de-shuffling of data in the single DIF block is performed
according to the de-shuffling equation:

$$
\begin{aligned}
n &= f^{-1}(t_1, s_1, b_1) \\
  &= K_1 x_1 + K_2 x_2 + c \\
  &= K_1 (b_1 / B) + K_2 (s_1 \% 3) + (m' * T + t_1 - 2 * \lfloor s_1/3 \rfloor) * 3 + \lfloor s_1/3 \rfloor
\end{aligned}
$$

where

$$
m' = \begin{cases}
1, & \text{if } (t_1 - 2 * \lfloor s_1/3 \rfloor) < 0 \\
0, & \text{else if } (t_1 - 2 * \lfloor s_1/3 \rfloor) \ge 0
\end{cases}
$$

where t₁, s₁, b₁ are the track, sync block and byte numbers respectively, included in the
single DIF block, and K₁, K₂ and B are constants characterizing a particular coding
scheme.

In one of the prior art methods, shuffling equations are used for de-shuffling
purposes. In this prior method, values of (t₁, s₁, b₁) are calculated for n = 0 to N, i.e.,
(t₁, s₁, b₁) = f(n), which leads to PCM[n] = DV[t₁][s₁][b₁].

It is noted that values of n run sequentially, whereas values of (t₁, s₁, b₁) do
not. This means that for a small n, the corresponding DV byte may appear in the very last
part of the frame. This means that an entire DV frame needs to be buffered in this method,
resulting in the storage of a large amount of transient data.

By contrast, embodiments of the present invention use a reverse mapping
relationship f⁻¹ which enables the position in the raw PCM frame to be determined directly
for any given data byte in the DV frame.

In a second broad form, the present invention also provides apparatus for 
performing the method of the first broad form of the invention. The apparatus is preferably 
a custom Digital Signal Processor (DSP). 

BRIEF DESCRIPTION OF THE DRAWINGS

For a better understanding of the present invention and to understand how
the same may be brought into effect, the invention will now be described by way of
example only, with reference to the appended drawings in which:

Figure 1 shows the format of the audio part of a DV data frame;

Figure 2 shows the shuffling of raw PCM data to encode it as part of a DV
data stream;

Figure 3 shows how a PCM frame is shuffled according to the shuffling
equations to produce DIF blocks within the DV data frame;

Figure 4 shows the de-shuffling of DV data to produce PCM data;

Figure 5 shows how DV data is de-shuffled according to de-shuffling
equations and output as PCM data; and

Figure 6 shows a flowchart detailing the steps in the de-shuffling process.

DETAILED DESCRIPTION OF THE INVENTION

Preferred embodiments of the invention are able to decode a received DV
audio stream based on analysis of a single DIF block rather than on an entire audio frame
as per the prior art solutions.

In the following description, the following formula is used:

For ∀ m, n ∈ N, or in other words, for any natural numbers m and n,

$$n = \lfloor n/m \rfloor * m + n \% m \qquad (7)$$

The constants C₁ and C₂ can be excluded from equations (1) - (6) without
any loss of generality. The equations may therefore be re-written as follows, although the
byte positions and sync block number are now offset.

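$$t_1 = [\lfloor n/3 \rfloor + 2 * (n \% 3)] \% T \qquad \text{(CH1)} \qquad (8)$$

$$t_2 = [\lfloor n/3 \rfloor + 2 * (n \% 3)] \% T + T \qquad \text{(CH2)} \qquad (9)$$

$$s_1 = 3 * (n \% 3) + \lfloor (n \% K_1) / K_2 \rfloor \qquad (10)$$

$$b_1 = B * \lfloor n / K_1 \rfloor \qquad \text{(byte 1)} \qquad (11)$$

$$b_2 = 1 + B * \lfloor n / K_1 \rfloor \qquad \text{(byte 2)} \qquad (12)$$

$$b_3 = 2 + B * \lfloor n / K_1 \rfloor \qquad \text{(byte 3 for 4-ch)} \qquad (13)$$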
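
The offset-free shuffling equations (8) to (13) can be sketched in code as follows. This is a minimal illustration, not a definitive implementation: the default constants T = 5, K₁ = 45, K₂ = 15 and B = 2 are assumed values for a 2-channel 525/60 system, and the function name `shuffle` is hypothetical.

```python
def shuffle(n, T=5, K1=45, K2=15, B=2):
    """Map PCM sample index n to its DV position, offsets C1 and C2 omitted.

    Returns (t1, t2, s1, byte_positions), where byte_positions lists the
    B consecutive byte positions occupied by the sample in the DIF block.
    The default constants are assumptions for 2-channel 525/60.
    """
    base = (n // 3 + 2 * (n % 3)) % T
    t1 = base                               # track for channel 1
    t2 = base + T                           # track for channel 2
    s1 = 3 * (n % 3) + (n % K1) // K2       # sync block number
    first = B * (n // K1)                   # first byte of the sample
    return t1, t2, s1, [first + i for i in range(B)]
```

For example, with these assumed constants consecutive samples n = 0, 1, 2 land on tracks 0, 2 and 4 and sync blocks 0, 3 and 6, which shows how neighbouring samples are spread across the frame.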



The various constants which were included in equations (1) to (6) can be
excluded at this stage as they are invariant within a particular format of DV data (e.g.,
2-channel 525/60). The sync block number and byte positions are effectively offset to absorb
C₁ and C₂.

As all data bytes belonging to the same audio sample are distributed
consecutively within the same DIF block (from equations (11)-(13)), once the first byte in
a sample is located, the other bytes may be easily located. The following derivation is for
channel one and the first data byte only. The other bytes may be found as described from
this information.



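$$\lfloor (n \% K_1) / K_2 \rfloor = x_2 \qquad (14)$$

$$\Rightarrow n \% K_1 = K_2 x_2 + c \qquad (c \in Z \text{ and } 0 \le c < K_2) \qquad (15)$$

$$\Rightarrow n = K_1 x_1 + K_2 x_2 + c \qquad (16)$$

where

$$x_1 = \lfloor n / K_1 \rfloor = b_1 / B \qquad (17)$$

From equations (14) - (16) and equation (10), it can be seen that:

$$x_2 = s_1 \% 3 \qquad (18)$$

$$c \% 3 = n \% 3 \qquad (19)$$

$$\lfloor s_1 / 3 \rfloor = n \% 3 \qquad (20)$$

$$c \% 3 = \lfloor s_1 / 3 \rfloor \qquad (21)$$

Equation (16) then yields:

$$\lfloor n/3 \rfloor = \lfloor (K_1 x_1)/3 \rfloor + \lfloor (K_2 x_2)/3 \rfloor + \lfloor c/3 \rfloor \qquad (22)$$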



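$$t_1 = [\lfloor n/3 \rfloor + 2 * (n \% 3)] \% T$$

$$\Rightarrow \lfloor n/3 \rfloor = m * T + t_1 - 2 * (n \% 3)$$

Together with equation (14) and its consequences (19), (21) and (22), this gives:

$$\Rightarrow \lfloor c/3 \rfloor = m' * T + t_1 - 2 * (c \% 3) = m' * T + t_1 - 2 * \lfloor s_1/3 \rfloor$$

where

$$m' = m - \lfloor (K_1 x_1)/3 \rfloor / T - \lfloor (K_2 x_2)/3 \rfloor / T.$$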

In order to evaluate m', the constraints of the various parameters may be
used as follows:

$$m' * T + t_1 - 2 * \lfloor s_1/3 \rfloor = \lfloor c/3 \rfloor \Rightarrow 0 \le m' * T + t_1 - 2 * \lfloor s_1/3 \rfloor < T$$

$$\Rightarrow -(T - 1) \le m' * T \le (T + 3) \Rightarrow m' \in \{0, 1\}$$

$$
\Rightarrow m' = \begin{cases}
1, & \text{if } (t_1 - 2 * \lfloor s_1/3 \rfloor) < 0 \\
0, & \text{else if } (t_1 - 2 * \lfloor s_1/3 \rfloor) \ge 0
\end{cases}
\qquad (23)
$$



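Equations (16) - (18), (22) and (23) can then be used to define the reverse
mapping, f⁻¹, as:

$$
\begin{aligned}
n &= f^{-1}(t_1, s_1, b_1) \\
  &= K_1 x_1 + K_2 x_2 + c \\
  &= K_1 (b_1 / B) + K_2 (s_1 \% 3) + (m' * T + t_1 - 2 * \lfloor s_1/3 \rfloor) * 3 + \lfloor s_1/3 \rfloor
\end{aligned}
\qquad (24)
$$

where

$$
m' = \begin{cases}
1, & \text{if } (t_1 - 2 * \lfloor s_1/3 \rfloor) < 0 \\
0, & \text{else if } (t_1 - 2 * \lfloor s_1/3 \rfloor) \ge 0
\end{cases}
$$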


Figure 5 illustrates the 525/60 system. It is apparent that suitable changes
may be made in order to adapt the process for other previously mentioned systems such as
625/50.

The process illustrated in Figure 5 includes the following steps: 

1. The explicit de-shuffling expression is determined from the shuffling
equations. This process to find f⁻¹ from f has already been described, and is performed
off-line, i.e., it is not necessary to perform the operation in real-time as it may be
performed in advance of receipt of the DV data stream.

2. One DIF block at a time is read from the external data stream. The
indices of the DV data (t₁, s₁, b₁) are used as the input arguments to the f⁻¹ process. This
allows the position n of the appropriate byte in the PCM data to be determined. The same
value n is then also used for the subsequent (B-1) byte(s).

3. If the system is operating in 2-channel mode, then PCM[n] = DV(t₁, s₁, b₁)
⊕ DV(t₁, s₁, b₁+1). If the system is not operating in 2-channel mode, then PCM[n] =
DV(t₁, s₁, b₁) ⊕ DV(t₁, s₁, b₁+1) ⊕ DV(t₁, s₁, b₁+2).

4. Steps 2 and 3 above are repeated until all the DIF blocks in the
received DV audio frame are de-shuffled.

5. Post process the de-shuffled data, if necessary, and output as a PCM
frame.

A preferred method of performing the de-shuffling operation is to use a
suitably programmed DSP (Digital Signal Processor). A single DIF block may be fetched
from an external memory to an internal memory of the DSP. The DIF block includes
system specific information from which the constants K₁, K₂, T and B may be determined.
These constants are used in the subsequent processing.

For the first DIF block of a new frame, the sync block number s₁, track
number t₁, and the DIF block counter are reset to zero. Whenever a new DIF block is
received, s₁ is incremented by 1, and is reset to zero every nine DIF blocks. Then t₁ is
incremented by 1 every nine DIF blocks. Each received DIF block includes 72 data bytes
which correspond to 72/B samples.

The shuffling equations reveal that individual data bytes belonging to the
same data sample are distributed consecutively in the same DIF block. Making use of this
fact, equation (24) is applied to only the first byte of each sample. This first byte, together
with the B-1 bytes which follow it, are used to determine the PCM sample with index n
calculated by the de-shuffling equations.

The pointer to the DIF block data is then incremented by B so that it points
to the first byte of the next sample. When all the DIF blocks in a DV frame have been
processed as described, the desired number of samples which have been stored in the PCM
buffer are written to the external memory, as shown in Figure 6.
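
The block-by-block procedure described above can be sketched as follows. This is a simplified illustration under several assumptions, not the DSP implementation itself: it handles channel 1 only, assumes 2-channel 525/60 constants (T = 5, K₁ = 45, K₂ = 15, B = 2), treats each DIF block as its 72 data bytes, and assumes that the bytes of a sample are joined most significant byte first. The helper `encode_frame` exists only to generate test input, and all names are hypothetical.

```python
T, K1, K2, B = 5, 45, 15, 2      # assumed 2-channel 525/60 constants
BLOCKS_PER_TRACK = 9             # s1 resets every nine DIF blocks
DATA_BYTES = 72                  # data bytes per DIF block


def shuffle(n):
    """Forward mapping f(n) -> (t1, s1, b1), offsets omitted."""
    return ((n // 3 + 2 * (n % 3)) % T,
            3 * (n % 3) + (n % K1) // K2,
            B * (n // K1))


def deshuffle(t1, s1, b1):
    """Reverse mapping following equation (24)."""
    q = s1 // 3
    d = t1 - 2 * q
    m = 1 if d < 0 else 0
    return K1 * (b1 // B) + K2 * (s1 % 3) + (m * T + d) * 3 + q


def encode_frame(pcm):
    """Test helper: shuffle samples into a list of DIF-block payloads."""
    blocks = [bytearray(DATA_BYTES) for _ in range(T * BLOCKS_PER_TRACK)]
    for n, sample in enumerate(pcm):
        t1, s1, b1 = shuffle(n)
        blocks[t1 * BLOCKS_PER_TRACK + s1][b1:b1 + B] = sample.to_bytes(B, "big")
    return blocks


def decode_frame(blocks, num_samples):
    """De-shuffle one DIF block at a time into a PCM buffer."""
    pcm = [0] * num_samples
    t1 = s1 = 0                              # reset at the start of a frame
    for count, block in enumerate(blocks):
        if count and count % BLOCKS_PER_TRACK == 0:
            s1 = 0                           # every nine blocks: reset s1,
            t1 += 1                          # and move to the next track
        for b1 in range(0, DATA_BYTES, B):   # pointer advances by B
            n = deshuffle(t1, s1, b1)
            if n < num_samples:              # positions beyond are dummy data
                pcm[n] = int.from_bytes(block[b1:b1 + B], "big")
        s1 += 1
    return pcm
```

In this sketch only the current block and the output buffer are held by the decoder; the list of blocks stands in for the external data stream.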

In contrast to prior art decoding systems, therefore, embodiments of the
present invention do not require an entire DV audio frame to be received before the
decoding process can begin. Also, there is no need to prepare and store a large look up
table, saving the overhead of providing relatively large amounts of memory.

Many embodiments of the invention, using the explicit reverse mapping
relationships described previously, are able to directly compile PCM data from incoming
DV audio data, requiring only a single DIF block at any one time. The indices t₁, s₁, b₁ are
all that is required to determine the position of the data in the original PCM frame.

The following table shows the reduction in memory which can be achieved
through use of embodiments of the invention with different video standards.



System   Conventional Method        Embodiments of the       Memory Reduction
         (Entire DV Frame Basis)    Invention (DIF Block     Factor
                                    Basis)
------   ------------------------   ----------------------   ----------------
NTSC     10*9 DIF blocks            1 DIF block              90
         = 10*9*80 bytes            = 80 bytes
         = 7200 bytes

PAL      12*9 DIF blocks            1 DIF block              108
         = 12*9*80 bytes            = 80 bytes
         = 8640 bytes



The following table illustrates the reduction in different processing 
operations which can be achieved through use of embodiments of the invention. 

Operation           Conventional Method        Embodiments of the       Reduction
                    (Entire DV Frame Basis)    Invention (DIF Block     Factor
                                               Basis)
-----------------   ------------------------   ----------------------   ---------
Modular Operation   3/sample                   1/sample                 67%

Division            3/sample                   2/sample                 33%



It can be seen that embodiments of the invention are able to provide decoding
of DV audio data using significantly less physical memory, and requiring significantly
fewer processing operations to achieve the same resultant data as can be achieved by prior
art solutions.

All of the above U.S. patents, U.S. patent application publications, U.S.
patent applications, foreign patents, foreign patent applications and non-patent publications
referred to in this specification and/or listed in the Application Data Sheet are incorporated
herein by reference, in their entirety.

In the light of the foregoing description, it will be clear to the skilled person 
that various modifications may be made within the scope of the invention. 

The present invention includes any novel feature or combination of features 
disclosed herein either explicitly or any generalization thereof irrespective of whether or 
not it relates to the claimed invention or mitigates any or all of the problems addressed. 